We obtain the ticker symbols of the companies that make up the S&P 500 from Wikipedia:
sp500 = pd.read_html('https://en.wikipedia.org/wiki/List_of_S%26P_500_companies')[0]
sp500["Symbol"]=sp500["Symbol"].map(lambda x: x.replace(".","-"))
sp500_symbols = sp500["Symbol"].to_list()
With the list of S&P 500 constituents in hand, we use the Yahoo Finance API to download their daily data from 2000 through 2021.
df_sp500 = yf.download(tickers=sp500_symbols, start="2000-01-01",
                       end="2021-12-31", interval="1d")
# Keep only the symbols that were actually downloaded
sp500_symbols = df_sp500["Close"].columns.to_list()
df = df_sp500.sort_index()
df.head()
[*********************100%***********************] 503 of 503 completed
1 Failed download: CEG: Data doesn't exist for startDate = 946702800, endDate = 1640926800
`df.head()` (5 rows × 3018 columns): a `Date`-indexed frame with MultiIndex columns (`Adj Close`, `Close`, `High`, `Low`, `Open`, `Volume`, one sub-column per ticker). The first trading days of 2000 show prices for long-listed tickers (A, AAPL, ABT, ...) and `NaN` for companies not yet listed at the time (AAL, ABBV, ACN, ZTS, ...).
# Overnight (gap) and open-to-close (intraday) log returns
gap_returns = np.log(df["Open"] / df["Close"].shift(1))
intraday_returns = np.log(df["Close"] / df["Open"])
# Daily variation and rolling annualized volatility in percent (~250 trading days/year)
df_variation = df["Adj Close"].pct_change()
df_volatility = df_variation.rolling(250).std() * 100 * np.sqrt(250)
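As a sanity check of the two return definitions above: on a tiny synthetic example (hypothetical prices, not from the dataset), the gap return plus the intraday return reproduces the close-to-close log return by construction.

```python
import numpy as np
import pandas as pd

# Two days of hypothetical open/close prices for one ticker
prices = pd.DataFrame({"Open": [100.0, 103.0], "Close": [102.0, 101.0]})

gap = np.log(prices["Open"] / prices["Close"].shift(1))     # overnight move
intraday = np.log(prices["Close"] / prices["Open"])         # open-to-close move
total = np.log(prices["Close"] / prices["Close"].shift(1))  # close-to-close move

# log(Open/prevClose) + log(Close/Open) = log(Close/prevClose)
assert np.isclose(gap.iloc[1] + intraday.iloc[1], total.iloc[1])
```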
weekday = gap_returns.index.map(lambda x: x.weekday())
best_day=pd.concat([
gap_returns.groupby(weekday).mean().T.mean().rename("Gap_return mean"),
gap_returns.groupby(weekday).std().T.mean().rename("Gap_return std"),
intraday_returns.groupby(weekday).mean().T.mean().rename("IntraDay_return mean"),
intraday_returns.groupby(weekday).std().T.mean().rename("IntraDay_return std"),
df_volatility.groupby(weekday).mean().T.mean().rename("Volatility"),
],axis=1)
best_day.reset_index(inplace=True)
best_day["Date"] = best_day["Date"].map({0:"Mon",1:"Tue",2:"Wed",3:"Thu",4:"Fri"})
best_day.rename(columns={"Date":"Day"},inplace=True)
best_day
| Day | Gap_return mean | Gap_return std | IntraDay_return mean | IntraDay_return std | Volatility | |
|---|---|---|---|---|---|---|
| 0 | Mon | 0.000089 | 0.012936 | -0.000298 | 0.019721 | 32.949131 |
| 1 | Tue | 0.000649 | 0.011787 | 0.000122 | 0.019740 | 32.913444 |
| 2 | Wed | 0.000166 | 0.011617 | 0.000226 | 0.020011 | 32.908567 |
| 3 | Thu | 0.000041 | 0.011965 | 0.000558 | 0.020366 | 32.877352 |
| 4 | Fri | 0.000288 | 0.012269 | 0.000276 | 0.018750 | 32.765478 |
fig, axs = plt.subplots(1, 2, figsize=(20, 5))
sns.barplot(x=best_day["Day"], y=best_day["Gap_return mean"], ax=axs[0]);
axs[0].set_title("Mean Gap Return per Day of the Week");
# seaborn's `ci` expects a float, "sd" or None, not a Series; draw the std as error bars instead
axs[1].bar(best_day["Day"], best_day["IntraDay_return mean"], yerr=best_day["IntraDay_return std"]);
axs[1].set_title("Mean IntraDay Return per Day of the Week");
Vix = df_volatility.T.mean().dropna()
fig = px.line(x=Vix.index, y=Vix, title="S&P-500 Volatility Over Time",labels=dict(x="Date", y="Volatility"))
fig.show()
The early 2000s already start from elevated volatility, most likely a tail of the dot-com bubble.
In 2007 the S&P 500 turns volatile again with the subprime mortgage crisis, and stays so until late 2010.
Finally, the last high-volatility episode in this sample is 2020, driven by the COVID-19 pandemic.
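The high-volatility episodes can also be located programmatically. A minimal sketch, using a synthetic series as a stand-in for the `Vix` series computed above (threshold choice of the 90th percentile is an assumption, not from the original analysis):

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the annualized volatility series `Vix`
rng = pd.date_range("2000-01-03", "2021-12-31", freq="B")
vol = pd.Series(20 + 10 * np.sin(np.arange(len(rng)) / 200.0), index=rng)

# Flag "high volatility" days as those in the top decile of the series
threshold = vol.quantile(0.90)
high_vol_days = vol[vol > threshold]

# Count the flagged days per year to identify the volatile regimes
counts = high_vol_days.index.year.value_counts().sort_index()
```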
# Per-company summary: listing years, total return, mean yearly return and volatility
df_perCompany = pd.DataFrame(sp500[["Symbol", "GICS Sector"]])
df_perCompany.rename(columns={"Symbol": "Ticker"}, inplace=True)
for ticker in sp500_symbols:
    df_adjClose_ticker = df["Adj Close"][ticker].dropna()
    if df_adjClose_ticker.shape[0] == 0:
        continue  # skip tickers with no data (e.g. the failed CEG download)
    year_index = df_adjClose_ticker.index.map(lambda x: x.year)
    first_close, last_close = df_adjClose_ticker.iloc[[0, -1]]
    total_return = (last_close / first_close) - 1
    first_year = df_adjClose_ticker.index[0].year
    last_year = df_adjClose_ticker.index[-1].year
    years = last_year - first_year + 1
    # Return of each calendar year: first vs last close within the year
    returnPerYear = []
    for year in range(first_year, last_year + 1):
        first_close_year, last_close_year = df_adjClose_ticker[year_index == year].iloc[[0, -1]]
        year_return = (last_close_year / first_close_year) - 1
        returnPerYear.append(year_return)
    mean_return_per_year = np.mean(returnPerYear)
    volatility = np.std(returnPerYear)
    df_perCompany.loc[
        df_perCompany["Ticker"] == ticker,
        ["years", "total_return", "mean_return_per_year", "volatility"],
    ] = years, total_return, mean_return_per_year, volatility
df_perCompany
| Ticker | GICS Sector | years | total_return | mean_return_per_year | volatility | |
|---|---|---|---|---|---|---|
| 0 | MMM | Industrials | 22.0 | 5.588992 | 0.104859 | 0.196200 |
| 1 | AOS | Industrials | 22.0 | 33.993191 | 0.199614 | 0.257195 |
| 2 | ABT | Health Care | 22.0 | 13.970998 | 0.146395 | 0.180032 |
| 3 | ABBV | Health Care | 9.0 | 4.608987 | 0.240326 | 0.229127 |
| 4 | ABMD | Health Care | 22.0 | 18.838904 | 0.314810 | 0.580944 |
| ... | ... | ... | ... | ... | ... | ... |
| 498 | YUM | Consumer Discretionary | 22.0 | 27.333428 | 0.182262 | 0.177138 |
| 499 | ZBRA | Information Technology | 22.0 | 22.875070 | 0.196867 | 0.308217 |
| 500 | ZBH | Health Care | 21.0 | 3.691292 | 0.106400 | 0.264423 |
| 501 | ZION | Financials | 22.0 | 0.618679 | 0.079462 | 0.352188 |
| 502 | ZTS | Health Care | 9.0 | 7.405851 | 0.281776 | 0.167108 |
503 rows × 6 columns
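The inner per-year loop above can also be written as a single groupby: take the first and last close of each calendar year, then compute all yearly returns at once. A sketch on a synthetic series standing in for `df["Adj Close"][ticker].dropna()`:

```python
import numpy as np
import pandas as pd

# Hypothetical adjusted-close series for one ticker, 2019-2021
idx = pd.date_range("2019-01-02", "2021-12-30", freq="B")
adj_close = pd.Series(np.linspace(50.0, 80.0, len(idx)), index=idx)

# First and last close of each calendar year, then the yearly return in one shot
by_year = adj_close.groupby(adj_close.index.year).agg(["first", "last"])
yearly_returns = by_year["last"] / by_year["first"] - 1

mean_return_per_year = yearly_returns.mean()
volatility = yearly_returns.std(ddof=0)  # ddof=0 matches np.std in the loop above
```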
We define our own risk-adjusted ratio:
$$ MyRatio = \frac{totalReturn}{years}*\frac{meanReturnPerYear}{volatility - RF} $$
with $ RF = 0.01/255 $ as a small risk-free adjustment.
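Plugging toy numbers into the formula (hypothetical values, not taken from the dataset) makes its behavior concrete: the ratio rewards a high total return per listing year and a high mean yearly return relative to volatility.

```python
# Toy check of MyRatio with hypothetical inputs
total_return = 5.0            # +500% over the whole holding period
years = 10
mean_return_per_year = 0.20
volatility = 0.30
RF = 0.01 / 255               # risk-free adjustment defined above

my_ratio = (total_return / years) * (mean_return_per_year / (volatility - RF))
```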
Rf = 0.01 / 255
df_perCompany["Return_Volatility_Ratio"] = (
    df_perCompany["mean_return_per_year"] * df_perCompany["total_return"]
) / ((df_perCompany["volatility"] - Rf) * df_perCompany["years"])
df_perCompany
| Ticker | GICS Sector | years | total_return | mean_return_per_year | volatility | Return_Volatility_Ratio | |
|---|---|---|---|---|---|---|---|
| 0 | MMM | Industrials | 22.0 | 5.588992 | 0.104859 | 0.196200 | 0.135801 |
| 1 | AOS | Industrials | 22.0 | 33.993191 | 0.199614 | 0.257195 | 1.199401 |
| 2 | ABT | Health Care | 22.0 | 13.970998 | 0.146395 | 0.180032 | 0.516507 |
| 3 | ABBV | Health Care | 9.0 | 4.608987 | 0.240326 | 0.229127 | 0.537230 |
| 4 | ABMD | Health Care | 22.0 | 18.838904 | 0.314810 | 0.580944 | 0.464063 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 498 | YUM | Consumer Discretionary | 22.0 | 27.333428 | 0.182262 | 0.177138 | 1.278647 |
| 499 | ZBRA | Information Technology | 22.0 | 22.875070 | 0.196867 | 0.308217 | 0.664219 |
| 500 | ZBH | Health Care | 21.0 | 3.691292 | 0.106400 | 0.264423 | 0.070740 |
| 501 | ZION | Financials | 22.0 | 0.618679 | 0.079462 | 0.352188 | 0.006346 |
| 502 | ZTS | Health Care | 9.0 | 7.405851 | 0.281776 | 0.167108 | 1.387850 |
502 rows × 7 columns
top9_companies=df_perCompany.sort_values(by="Return_Volatility_Ratio",ascending=False)[0:9]
top9_companies
| Ticker | GICS Sector | years | total_return | mean_return_per_year | volatility | Return_Volatility_Ratio | |
|---|---|---|---|---|---|---|---|
| 322 | MNST | Consumer Staples | 22.0 | 1060.729174 | 0.538903 | 0.935435 | 27.777751 |
| 352 | ODFL | Industrials | 22.0 | 366.419743 | 0.341535 | 0.348595 | 16.319983 |
| 330 | NFLX | Communication Services | 20.0 | 510.597445 | 0.665004 | 1.127123 | 15.063184 |
| 443 | TSLA | Consumer Discretionary | 12.0 | 223.014230 | 1.055824 | 2.039080 | 9.623143 |
| 347 | NVDA | Information Technology | 22.0 | 329.243697 | 0.597267 | 0.943422 | 9.474912 |
| 245 | IDXX | Health Care | 22.0 | 163.686694 | 0.285135 | 0.243441 | 8.716008 |
| 448 | TSCO | Consumer Discretionary | 22.0 | 270.747039 | 0.409297 | 0.692645 | 7.272673 |
| 176 | EPAM | Information Technology | 10.0 | 47.081430 | 0.525872 | 0.343443 | 7.209812 |
| 376 | POOL | Consumer Discretionary | 22.0 | 139.707631 | 0.289053 | 0.272894 | 6.727335 |
fig, ax = plt.subplots(figsize=(15, 5))
sns.barplot(x=top9_companies["Ticker"], y=top9_companies["Return_Volatility_Ratio"], ax=ax);
fig = px.sunburst(top9_companies, path=['GICS Sector', 'Ticker'], values='Return_Volatility_Ratio',
color='total_return')
fig.show()
# Average the numeric columns per sector (numeric_only skips the Ticker column)
df_perSector = df_perCompany.groupby("GICS Sector").mean(numeric_only=True)
Rf = 0.01 / 255
df_perSector["Return_Volatility_Ratio"] = (
    df_perSector["mean_return_per_year"] * df_perSector["total_return"]
) / ((df_perSector["volatility"] - Rf) * df_perSector["years"])
df_perSector.sort_values("Return_Volatility_Ratio",ascending=False,inplace=True)
df_perSector
| years | total_return | mean_return_per_year | volatility | Return_Volatility_Ratio | |
|---|---|---|---|---|---|
| GICS Sector | |||||
| Consumer Staples | 20.545455 | 41.317334 | 0.132782 | 0.215313 | 1.240415 |
| Consumer Discretionary | 19.810345 | 33.109084 | 0.243166 | 0.439972 | 0.923787 |
| Health Care | 20.203125 | 28.735543 | 0.224978 | 0.378442 | 0.845642 |
| Communication Services | 16.961538 | 29.852878 | 0.153494 | 0.345736 | 0.781482 |
| Industrials | 19.328767 | 21.458846 | 0.190819 | 0.280363 | 0.755725 |
| Information Technology | 18.648649 | 24.796384 | 0.250625 | 0.441870 | 0.754240 |
| Real Estate | 21.032258 | 16.081849 | 0.198249 | 0.341831 | 0.443505 |
| Materials | 18.964286 | 13.308978 | 0.161402 | 0.305811 | 0.370441 |
| Financials | 20.621212 | 11.283377 | 0.144718 | 0.295260 | 0.268225 |
| Utilities | 21.607143 | 9.062460 | 0.134937 | 0.228246 | 0.248000 |
| Energy | 19.809524 | 8.060907 | 0.156099 | 0.402859 | 0.157689 |
fig, ax = plt.subplots(figsize=(30, 8))
sns.barplot(x=df_perSector.index, y=df_perSector["Return_Volatility_Ratio"], ax=ax);
# Min-max scale total_return, shifted by +1 so every sunburst slice size is strictly positive
min_return = df_perCompany["total_return"].min()
max_return = df_perCompany["total_return"].max()
total_return_scale = (df_perCompany["total_return"] + 1 - min_return) / (max_return - min_return)
fig = px.sunburst(df_perCompany, path=["GICS Sector", "Ticker"], values=total_return_scale,
                  color="volatility")
fig.show()
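The +1 shift in the scaling deserves a note: sunburst slice sizes must be strictly positive, and a plain min-max scale would map the worst performer to exactly 0, dropping it from the chart. A toy illustration with hypothetical total returns:

```python
import pandas as pd

# Hypothetical total returns, including a negative one
s = pd.Series([-2.0, 0.0, 3.0, 8.0])

# Shifted min-max scale: the minimum maps to 1/range instead of 0,
# so every slice keeps a positive size in the sunburst
scaled = (s + 1 - s.min()) / (s.max() - s.min())
```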